Escaping Saddle Points with Adaptive Gradient Methods
Adaptive methods such as Adam and RMSProp are widely used in deep learning
but are not well understood. In this paper, we seek a crisp, clean and precise
characterization of their behavior in nonconvex settings. To this end, we first
provide a novel view of adaptive methods as preconditioned SGD, where the
preconditioner is estimated in an online manner. By studying the preconditioner
on its own, we elucidate its purpose: it rescales the stochastic gradient noise
to be isotropic near stationary points, which helps escape saddle points.
Furthermore, we show that adaptive methods can efficiently estimate the
aforementioned preconditioner. By gluing together these two components, we
provide the first (to our knowledge) second-order convergence result for any
adaptive method. The key insight from our analysis is that, compared to SGD,
adaptive methods escape saddle points faster, and can converge faster overall
to second-order stationary points.
Comment: Update Theorem 4.1 and proof to use martingale concentration bounds,
i.e. matrix Freedma
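The "preconditioned SGD" view described in the abstract above can be illustrated with a minimal sketch. This is the textbook RMSProp update with a diagonal preconditioner, not the algorithm or analysis from the paper; the function name, the hyperparameter defaults, and the quadratic test objective are all assumptions chosen for illustration.

```python
import math

def rmsprop_as_preconditioned_sgd(grad_fn, x0, lr=0.01, beta=0.9, eps=1e-8, steps=100):
    """Textbook RMSProp written as x <- x - lr * A^{-1/2} g, with diagonal A.

    Hypothetical helper for illustration only. A is an exponential moving
    average of squared gradients, estimated online from the gradients seen.
    """
    x = list(x0)
    second_moment = [0.0] * len(x)  # running estimate of diag(E[g g^T])
    for _ in range(steps):
        g = grad_fn(x)
        # Online update of the diagonal preconditioner.
        second_moment = [beta * v + (1 - beta) * gi * gi
                         for v, gi in zip(second_moment, g)]
        # Apply the preconditioner: rescale each coordinate by 1/sqrt(v).
        preconditioned = [gi / (math.sqrt(v) + eps)
                          for gi, v in zip(g, second_moment)]
        # Plain SGD step on the rescaled gradient.
        x = [xi - lr * pi for xi, pi in zip(x, preconditioned)]
    return x, second_moment

# Assumed toy objective: f(x) = 0.5 * (100 * x[0]**2 + x[1]**2).
# Its gradient is badly scaled across the two coordinates.
x, v = rmsprop_as_preconditioned_sgd(lambda x: [100.0 * x[0], x[1]], [1.0, 1.0], steps=50)
print(x, v)
```

On this toy problem the two coordinates follow almost the same trajectory even though their gradients differ by a factor of 100, which is the per-coordinate rescaling effect in its simplest form. The sketch uses deterministic gradients and says nothing about saddle points or noise.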
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
Leveraging new data sources is a key step in accelerating the pace of
materials design and discovery. To complement the strides in synthesis planning
driven by historical, experimental, and computed data, we present an automated
method for connecting scientific literature to synthesis insights. Starting
from natural language text, we apply word embeddings from language models,
which are fed into a named entity recognition model, upon which a conditional
variational autoencoder is trained to generate syntheses for arbitrary
materials. We show the potential of this technique by predicting precursors for
two perovskite materials, using only training data published over a decade
prior to their first reported syntheses. We demonstrate that the model learns
representations of materials corresponding to synthesis-related properties, and
that the model's behavior complements existing thermodynamic knowledge.
Finally, we apply the model to perform synthesizability screening for proposed
novel perovskite compounds.
Comment: Added new funding support to the acknowledgments section in this
versio
Pan-cancer analysis of whole genomes
Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale (1-3). Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter (4); identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation (5,6); analyses timings and patterns of tumour evolution (7); describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity (8,9); and evaluates a range of more-specialized features of cancer genomes (8,10-18).
Peer reviewe
Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing
Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.
Learning and optimization in the face of data perturbations
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 145-163).
Many problems in the machine learning pipeline boil down to maximizing the expectation of a function over a distribution. This is the classic problem of stochastic optimization. There are two key challenges in solving such stochastic optimization problems: 1) the function is often non-convex, making optimization difficult; 2) the distribution is not known exactly, but may be perturbed adversarially or is otherwise obscured. Each issue is individually challenging enough to warrant a substantial body of work addressing it, but addressing them simultaneously remains difficult. This thesis addresses problems at the intersection of non-convexity and data perturbations. We study the intersection of the two issues along two dual lines of inquiry: first, we build perturbation-aware algorithms with guarantees for non-convex problems; second, we seek to understand how data perturbations can be leveraged to enhance non-convex optimization algorithms. Along the way, we will study new types of data perturbations and seek to understand their connection to generalization.
by Matthew James Staib.